Do not close FDs 0, 1, or 2 #186

Open
wants to merge 10 commits into
base: main

DemiMarie
Contributor

If they are closed, another file descriptor could be created with these numbers, and so standard library functions that use them might write to an unwanted place. dup2() a file descriptor to /dev/null over them instead.
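To illustrate the description above, here is a minimal sketch (not code from this PR) of pointing a standard descriptor at /dev/null instead of closing it. It relies on the POSIX rule that open() returns the lowest-numbered unused descriptor, which is exactly why a closed FD 0, 1, or 2 gets reused by the next open(). The helper name `redirect_to_devnull` is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: make `target` (0, 1, or 2) refer to /dev/null
 * instead of closing it. `flags` is O_RDONLY, O_WRONLY, or O_RDWR.
 * Returns 0 on success, -1 on failure. */
static int redirect_to_devnull(int target, int flags)
{
    assert(target >= 0 && target <= 2);

    int null_fd = open("/dev/null", flags);
    if (null_fd < 0)
        return -1;

    /* If `target` was already closed, open() reused its number and
     * there is nothing left to do. */
    if (null_fd == target)
        return 0;

    /* dup2() atomically replaces whatever `target` referred to. */
    int rc = dup2(null_fd, target);
    close(null_fd);
    return rc < 0 ? -1 : 0;
}
```

After `redirect_to_devnull(STDIN_FILENO, O_RDONLY)`, a later open() can no longer be handed descriptor 0, and reads from stdin simply return end-of-file.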

codecov bot commented Jan 9, 2025

Codecov Report

Attention: Patch coverage is 51.21951% with 40 lines in your changes missing coverage. Please review.

Project coverage is 79.16%. Comparing base (b40d3da) to head (92cb1ef).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| agent/qrexec-agent.c | 51.21% | 40 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #186      +/-   ##
==========================================
- Coverage   79.17%   79.16%   -0.01%     
==========================================
  Files          54       54              
  Lines        9953     9989      +36     
==========================================
+ Hits         7880     7908      +28     
- Misses       2073     2081       +8     

@DemiMarie
Contributor Author

Codecov appears to not be testing what happens in the child process after fork() and the error path of “cannot open /dev/null”.

@marmarek
Member

AFAIR, unit tests do not cover the PAM handling part: they do not run as a system service, and test runners don't have the necessary PAM configuration, etc.

@qubesos-bot

qubesos-bot commented Jan 11, 2025

OpenQA test summary

Complete test suite and dependencies: https://openqa.qubes-os.org/tests/overview?distri=qubesos&version=4.3&build=2025011809-4.3&flavor=pull-requests

Test run included the following:

New failures, excluding unstable

Compared to: https://openqa.qubes-os.org/tests/overview?distri=qubesos&version=4.3&build=2024111705-4.3&flavor=update

  • system_tests_dispvm

    • TC_20_DispVM_fedora-41-xfce: test_100_open_in_dispvm (failure)
      AssertionError: './open-file test.txt' failed with ./open-file test...
  • system_tests_kde_gui_interactive

    • gui_filecopy: unnamed test (unknown)
    • gui_filecopy: Failed (test died)
      # Test died: no candidate needle with tag(s) 'files-test-file' matc...
  • system_tests_whonix

    • whonixcheck: Failed (test died)
      # Test died: command 'curl --form upload=@whonixcheck-anon-whonix ....
  • system_tests_basic_vm_qrexec_gui@hw7

Failed tests

6 failures
  • system_tests_dispvm

    • TC_20_DispVM_fedora-41-xfce: test_100_open_in_dispvm (failure)
      AssertionError: './open-file test.txt' failed with ./open-file test...
  • system_tests_kde_gui_interactive

    • gui_filecopy: unnamed test (unknown)
    • gui_filecopy: Failed (test died)
      # Test died: no candidate needle with tag(s) 'files-test-file' matc...
  • system_tests_basic_vm_qrexec_gui_zfs

    • switch_pool: Failed (test died)
      # Test died: command 'qubes-dom0-update -y zfs' failed at /usr/lib/...
  • system_tests_whonix

    • whonixcheck: Failed (test died)
      # Test died: command 'curl --form upload=@whonixcheck-anon-whonix ....
  • system_tests_basic_vm_qrexec_gui@hw7

Fixed failures

Compared to: https://openqa.qubes-os.org/tests/119126#dependencies

3 fixed
  • system_tests_extra

    • TC_00_QVCTest_whonix-gateway-17: test_010_screenshare (failure)
      ~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^... AssertionError: 0 == 0
  • system_tests_kde_gui_interactive

    • gui_keyboard_layout: Failed (test died)
      # Test died: command 'test "$(cd ~user;ls e1*)" = "$(qvm-run -p wor...
  • system_tests_audio@hw1

Unstable tests

@marmarek
Member

system_tests_qrexec

* TC_00_Qrexec_debian-12-xfce: [test_053_qrexec_vm_service_eof_reverse](https://openqa.qubes-os.org/tests/125357#step/TC_00_Qrexec_debian-12-xfce/4) (failure)
  `AssertionError: Timeout, probably EOF wasn't transferred`

* TC_00_Qrexec_debian-12-xfce: [test_092_qrexec_service_socket_dom0_eof_reverse](https://openqa.qubes-os.org/tests/125357#step/TC_00_Qrexec_debian-12-xfce/18) (failure)
  `AssertionError: service timeout, probably EOF wasn't transferred fr...`

* TC_00_Qrexec_debian-12-xfce: [test_098_qrexec_service_socket_vm_eof](https://openqa.qubes-os.org/tests/125357#step/TC_00_Qrexec_debian-12-xfce/23) (failure)
  `AssertionError: service timeout, probably EOF wasn't transferred to...`

* TC_00_Qrexec_fedora-41-xfce: [test_053_qrexec_vm_service_eof_reverse](https://openqa.qubes-os.org/tests/125357#step/TC_00_Qrexec_fedora-41-xfce/4) (failure)
  `AssertionError: Timeout, probably EOF wasn't transferred`

* TC_00_Qrexec_fedora-41-xfce: [test_092_qrexec_service_socket_dom0_eof_reverse](https://openqa.qubes-os.org/tests/125357#step/TC_00_Qrexec_fedora-41-xfce/18) (failure)
  `AssertionError: service timeout, probably EOF wasn't transferred fr...`

* TC_00_Qrexec_fedora-41-xfce: [test_098_qrexec_service_socket_vm_eof](https://openqa.qubes-os.org/tests/125357#step/TC_00_Qrexec_fedora-41-xfce/23) (failure)
  `AssertionError: service timeout, probably EOF wasn't transferred to...`

* TC_00_Qrexec_whonix-gateway-17: [test_053_qrexec_vm_service_eof_reverse](https://openqa.qubes-os.org/tests/125357#step/TC_00_Qrexec_whonix-gateway-17/4) (failure)
  `AssertionError: Timeout, probably EOF wasn't transferred`

* TC_00_Qrexec_whonix-gateway-17: [test_083_qrexec_service_argument_specific_implementation](https://openqa.qubes-os.org/tests/125357#step/TC_00_Qrexec_whonix-gateway-17/14) (error)
  `subprocess.CalledProcessError: Command '/usr/lib/qubes/qrexec-clien...`

* TC_00_Qrexec_whonix-gateway-17: [test_092_qrexec_service_socket_dom0_eof_reverse](https://openqa.qubes-os.org/tests/125357#step/TC_00_Qrexec_whonix-gateway-17/18) (failure)
  `AssertionError: service timeout, probably EOF wasn't transferred fr...`

* TC_00_Qrexec_whonix-gateway-17: [test_098_qrexec_service_socket_vm_eof](https://openqa.qubes-os.org/tests/125357#step/TC_00_Qrexec_whonix-gateway-17/23) (failure)
  `AssertionError: service timeout, probably EOF wasn't transferred to...`

* TC_00_Qrexec_whonix-workstation-17: [test_053_qrexec_vm_service_eof_reverse](https://openqa.qubes-os.org/tests/125357#step/TC_00_Qrexec_whonix-workstation-17/4) (failure)
  `AssertionError: Timeout, probably EOF wasn't transferred`

* TC_00_Qrexec_whonix-workstation-17: [test_092_qrexec_service_socket_dom0_eof_reverse](https://openqa.qubes-os.org/tests/125357#step/TC_00_Qrexec_whonix-workstation-17/18) (failure)
  `AssertionError: service timeout, probably EOF wasn't transferred fr...`

* TC_00_Qrexec_whonix-workstation-17: [test_098_qrexec_service_socket_vm_eof](https://openqa.qubes-os.org/tests/125357#step/TC_00_Qrexec_whonix-workstation-17/23) (failure)
  `AssertionError: service timeout, probably EOF wasn't transferred to...`

This is the only qrexec PR in this test run, so the above failures seem to be a regression caused by this change.

@@ -162,6 +162,11 @@ void buffer_append(struct buffer *b, const char *data, int len);
void buffer_remove(struct buffer *b, int len);
int buffer_len(struct buffer *b);
void *buffer_data(struct buffer *b);
/* Open /dev/null and keep it from being closed before the exec func is called.
Member

Isn't it simpler (and safer) to simply open /dev/null just before dup-ing it over 0,1,2 (in the child process already)?

Contributor Author

It’s definitely simpler, but I didn’t want the extra call to open(). That said, giving each child process the same open file description is less than great, as it introduces shared state that should not be there. I don’t know if /dev/null has any such state, but still it isn’t great, so I went ahead and switched to your approach.
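As a rough illustration of the approach discussed in this thread (opening /dev/null in the child, right before dup-ing it over 0, 1, and 2), here is a generic sketch. It is not the qrexec code: `run_with_null_stdio` is a hypothetical name, and the exit codes 126 and 127 are just the usual shell conventions. Because each child calls open() itself, every child gets its own open file description rather than sharing one inherited from the parent.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with stdin, stdout, and stderr all
 * pointing at /dev/null. Returns the child's exit status, or -1 on error. */
static int run_with_null_stdio(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: open /dev/null here, so no descriptor has to be kept
         * alive in the parent across the fork. */
        int null_fd = open("/dev/null", O_RDWR);
        if (null_fd < 0)
            _exit(126);
        for (int fd = 0; fd <= 2; fd++) {
            /* Skip the case where open() itself returned 0, 1, or 2. */
            if (fd != null_fd && dup2(null_fd, fd) < 0)
                _exit(126);
        }
        if (null_fd > 2)
            close(null_fd);
        execvp(argv[0], argv);
        _exit(127); /* exec failed */
    }

    /* Parent: wait for the child, retrying if interrupted by a signal. */
    int status;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The cost is one extra open() per child, which is the trade-off mentioned in the comment above.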

@marmarek
Member

But also, I question the usefulness of this PR as a whole - the closing of standard FDs happens in a process that has a single purpose - wait for the child process and then exit, in the very same function where the closing happens. There are a few PAM cleanup calls, but it's very unlikely for them to be a problem (especially, it isn't a problem now, and hasn't been for the last 10 or so years).

@DemiMarie
Contributor Author

Whether or not the last commit in the PR is merged, I definitely think the other commits should be merged. In particular, it turned out that the “close the FD” functionality had no unit tests because the unit tests took a codepath that was too different from the production code. This PR makes the production and test code follow the same path, with the result that the actual bug (/dev/null FD being closed by fix_fds()) is now caught. I think that this test improvement (and the other bug fixes) is itself useful.

There are a few PAM cleanup calls, but it's very unlikely for them to be a problem (especially, it isn't a problem now, and hasn't been for the last 10 or so years).

PAM cleanup calls into PAM modules, so it can do anything. I suspect Qubes OS only gets away with it because we have a fairly simple PAM stack by default. PAM cleanup is used for e.g. unmounting filesystems and closing encrypted volumes.

The best approach would be for PAM to run with stdin pointed at /dev/null and stdout and stderr pointed at the system log. The FDs would be fixed directly before executing the child process. That’s a bigger refactor, though.

@marmarek
Member

marmarek commented Jan 12, 2025

Whether or not the last commit in the PR is merged,

Indeed I was talking about the last commit (which until the last force-push was the only commit in this PR).

PAM cleanup calls into PAM modules, so it can do anything. I suspect Qubes OS only gets away with it because we have a fairly simple PAM stack by default.

Aren't PAM modules expected to handle proper logging themselves? I don't think they are supposed to touch the calling process's stdin/out/err in any case. And if they did, that would likely also interfere with cases where they aren't closed (and then replaced with an unrelated thing) - for example, it could interfere with an application log file on stderr that is expected to be in a specific format (different from PAM messages).

@marmarek
Member

As for the other commits, won't that have some non-trivial conflicts with #141 (which I hope is quite close to merge-able state)?

@DemiMarie
Contributor Author

As for the other commits, won't that have some non-trivial conflicts with #141 (which I hope is quite close to merge-able state)?

I can include them in #141 or rebase this PR on top of it. I can also close this PR if you prefer, but I’d prefer that at least the bug fixes and testability changes go in.

This is the convention used by the rest of qrexec.  This commit should
be backported to stable branches.
These should never happen, but call exit() if they do.
Saves an (admittedly cheap) system call.
No functional change intended.
This will be used by tests later.

No functional change intended.
This will be used by tests later.  No functional change intended.
This also fixes a bug: basename can mutate its argument, so a copy must
be passed to it.
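
For context on the basename remark: POSIX allows basename() to modify the string it is given and to return a pointer into static storage, so callers that need the original path intact pass it a copy. A minimal sketch of that pattern follows; `basename_dup` is a hypothetical helper, not a function from this PR.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <libgen.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a newly allocated copy of the final
 * component of `path`, leaving `path` itself untouched. The caller
 * frees the result. Returns NULL on allocation failure. */
static char *basename_dup(const char *path)
{
    assert(path != NULL);

    char *copy = strdup(path);
    if (copy == NULL)
        return NULL;

    /* basename() may write into `copy` or return static storage, so
     * duplicate its result before freeing the scratch copy. */
    char *result = strdup(basename(copy));
    free(copy);
    return result;
}
```

Including `<libgen.h>` selects the POSIX version of basename(); the GNU variant from `<string.h>` behaves differently and never modifies its argument.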
This makes the unit test code more like the actual code used by
end-users, and therefore makes the tests more accurate.
If they are closed, another file descriptor could be created with these
numbers, and so standard library functions that use them might write to
an unwanted place.  dup2() a file descriptor to /dev/null over them
instead.

Also statically initialize trigger_fd to -1, which is the conventional
value for an invalid file descriptor.

This requires care to avoid closing the file descriptor to /dev/null in
fix_fds(), which took over an hour to debug.
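
The commit message says only that fix_fds() was closing the /dev/null descriptor; its implementation is not shown here. The general pitfall can be sketched as follows, under the assumption that some cleanup loop closes a range of descriptors: any descriptor that must survive has to be excluded explicitly. `close_fds_except` is a hypothetical helper, and -1 (the conventional invalid descriptor mentioned above) means "keep nothing".

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <unistd.h>

/* Hypothetical helper: close every descriptor in [first, limit) except
 * `keep`. Pass keep = -1 to close the whole range. */
static void close_fds_except(int first, int limit, int keep)
{
    assert(first >= 0 && first <= limit);

    for (int fd = first; fd < limit; fd++) {
        if (fd != keep)
            (void)close(fd); /* EBADF for unused numbers is harmless here */
    }
}
```

Since -1 never matches a real descriptor number, a variable initialized to -1 can be passed as `keep` safely before it has been assigned a real descriptor.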
3 participants